Weighted and Probabilistic Context-Free Grammars Are Equally Expressive

Authors

  • Noah A. Smith
  • Mark Johnson
Abstract

This paper studies the relationship between weighted context-free grammars (WCFGs), where each production is associated with a positive real-valued weight, and probabilistic context-free grammars (PCFGs), where the weights of the productions associated with a nonterminal are constrained to sum to one. Since the class of WCFGs properly includes the PCFGs, one might expect that WCFGs can describe distributions that PCFGs cannot. However, Chi (1999) and Abney, McAllester, and Pereira (1999) proved that every WCFG distribution is equivalent to some PCFG distribution. We extend their results to conditional distributions, and show that every WCFG conditional distribution of parses given strings is also the conditional distribution defined by some PCFG, even when the WCFG’s partition function diverges. This shows that any parsing or labeling accuracy improvement from conditional estimation of WCFGs or CRFs over joint estimation of PCFGs or HMMs is due to the estimation procedure rather than the change in model class, since PCFGs and HMMs are exactly as expressive as WCFGs and chain-structured CRFs respectively.
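The equivalence for joint distributions mentioned in the abstract can be illustrated with a small sketch. It is not the paper's construction for the conditional case; it only shows the standard renormalization for a WCFG whose partition function converges, where each rule weight is multiplied by the partition functions of the nonterminals on its right-hand side and divided by that of its left-hand side. The toy grammar, the dictionary layout, and the function names `partition_functions` and `to_pcfg` are all assumptions made for illustration.

```python
from math import prod

# Hypothetical toy WCFG: nonterminal -> list of (weight, right-hand side).
# Any symbol that is a key of the dict is a nonterminal; all others are terminals.
WCFG = {
    "S": [(1.0, ("S", "S")), (0.16, ("a",))],
}

def partition_functions(grammar, iterations=200):
    """Approximate Z_A (total weight of all trees rooted in A) by
    fixed-point iteration from zero. Assumes the partition function converges."""
    z = {nt: 0.0 for nt in grammar}
    for _ in range(iterations):
        z = {
            nt: sum(w * prod(z[s] for s in rhs if s in grammar) for w, rhs in rules)
            for nt, rules in grammar.items()
        }
    return z

def to_pcfg(grammar):
    """Renormalize: p(A -> alpha) = w(A -> alpha) * prod of Z_B over
    nonterminals B in alpha, divided by Z_A. Assumes every Z_A is positive."""
    z = partition_functions(grammar)
    return {
        nt: [(w * prod(z[s] for s in rhs if s in grammar) / z[nt], rhs)
             for w, rhs in rules]
        for nt, rules in grammar.items()
    }

pcfg = to_pcfg(WCFG)
for p, rhs in pcfg["S"]:
    print("S ->", " ".join(rhs), round(p, 6))
```

For this toy grammar the partition function solves Z = Z² + 0.16, whose smallest root is 0.2, so the rule probabilities come out as approximately 0.2 for S → S S and 0.8 for S → a. A tree with n leaves has normalized weight 0.16ⁿ / 0.2 under the WCFG and probability 0.2ⁿ⁻¹ · 0.8ⁿ under the PCFG, which are the same value.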

Similar resources

Studying impressive parameters on the performance of Persian probabilistic context free grammar parser

In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, The Penn Treebank, was published. However, although originating in computational linguistics, the value of tree bank is becoming more widely appreciated in linguistics research as a whole. F...


The Generative Power of Probabilistic and Weighted Context-Free Grammars

Over the last decade, probabilistic parsing has become the standard in the parsing literature where one of the purposes of those probabilities is to discard unlikely parses. We investigate the effect that discarding low probability parses has on both the weak and strong generative power of context-free grammars. We prove that probabilistic context-free grammars are more powerful than their non-...


Extending Parikh's Theorem to Weighted and Probabilistic Context-Free Grammars

We prove an analog of Parikh’s theorem for weighted context-free grammars over commutative, idempotent semirings, and exhibit a stochastic context-free grammar with behavior that cannot be realized by any stochastic right-linear context-free grammar. Finally, we show that every unary stochastic context-free grammar with polynomially-bounded ambiguity has an equivalent stochastic right-linear co...


Parikh’s Theorem for Weighted and Probabilistic Context-Free Grammars

We prove an analog of Parikh’s theorem for weighted context-free grammars over commutative, idempotent semirings, and exhibit a stochastic context-free grammar with behavior that cannot be realized by any stochastic right-linear context-free grammar. Finally, we show that every unary stochastic context-free grammar with polynomially-bounded ambiguity has an equivalent stochastic right-linear co...


A Note on the Expressive Power of Probabilistic Context Free Grammars

We examine the expressive power of probabilistic context free grammars (PCFGs), with a special focus on the use of probabilities as a mechanism for reducing ambiguity by filtering out unwanted parses. Probabilities in PCFGs induce an ordering relation among the set of trees that yield a given input sentence. PCFG parsers return the trees bearing the maximum probability for a given sentence, dis...



Journal:
  • Computational Linguistics

Volume 33

Publication date: 2007